Website Overview Presentation
12/10/25
Premier League Data Visualization
Question
- How do teams differ in average goals scored at home in the 2021-22 season?
What I did
- Used match level data from the Premier League TidyTuesday dataset
- Grouped by home team and calculated average home goals
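The grouping step can be sketched in a few lines. This is only an illustration under stated assumptions: the field names `home_team` and `home_goals` are hypothetical stand-ins for whatever the dataset calls them, and the rows are made up, not real 2021-22 results.

```python
from collections import defaultdict

# Made-up match rows for illustration only; not real 2021-22 results.
# The keys "home_team" and "home_goals" are hypothetical column names.
matches = [
    {"home_team": "Team A", "home_goals": 3},
    {"home_team": "Team A", "home_goals": 1},
    {"home_team": "Team B", "home_goals": 0},
    {"home_team": "Team B", "home_goals": 2},
    {"home_team": "Team B", "home_goals": 1},
]


def average_home_goals(rows):
    """Return {home_team: mean home goals per match}."""
    totals = defaultdict(int)
    games = defaultdict(int)
    for row in rows:
        totals[row["home_team"]] += row["home_goals"]
        games[row["home_team"]] += 1
    return {team: totals[team] / games[team] for team in totals}


averages = average_home_goals(matches)

# Sort teams from highest to lowest average, as a bar chart would show them.
ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
for team, avg in ranked:
    print(f"{team}: {avg:.2f}")
```

Because the function only needs a team label and a goal count per match, the same sketch would apply to any league or season with match-level rows.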
Premier League Takeaways
- Manchester City and Liverpool had the highest average home goals per match
- Simple summaries already tell a useful story about performance
- The same approach could be reused for any league or season
Olympics Data Visualization
Question
- How do medal counts change over the years and what factors may affect this?
What I did
- Used Olympic medal data from the Olympics TidyTuesday dataset
- Wrangled the data to see how the total number of medals changed over time
- Created a plot to spot long term trends and interruptions in the Games
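A minimal sketch of the wrangling idea: count medals per year, then flag years with no data. The medal records here are made up, and the four-year cycle is an assumption that only fits a single season of the Games (for example, Summer only); it is not taken from the original analysis.

```python
from collections import Counter

# Made-up medal records for illustration only: one entry per medal awarded.
medal_years = [1908, 1908, 1912, 1912, 1912, 1920, 1920, 1920, 1920]


def medals_per_year(years):
    """Count medals for each year, sorted chronologically."""
    return dict(sorted(Counter(years).items()))


def missing_editions(counts, step=4):
    """List years on the assumed cycle (every `step` years) with no data."""
    years = sorted(counts)
    expected = range(years[0], years[-1] + 1, step)
    return [year for year in expected if year not in counts]


counts = medals_per_year(medal_years)
gaps = missing_editions(counts)

print(counts)  # yearly totals, ready to plot as a line
print(gaps)    # years on the cycle where no medals were recorded
```

Plotting `counts` as a line shows the long-term trend, while `gaps` marks the spots where the line should break rather than connect across a missing edition.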
Olympic Trend Plot
Olympics Takeaways
- The overall trend climbs upward
- Growth tied to bigger programs, more events, and wider participation across countries
- More opportunities, especially in women’s competitions, add to the rise
- Dips and gaps in the data line up with major world events that interrupted the Games
- Gives a good sense of how the Games have scaled and evolved
Kidz Bop Censored Lyrics Analysis
What Kidz Bop Censors Most
- Used the Kidz Bop censored-lyrics dataset from The Pudding to understand what kinds of content get changed most often
![]()
- Reflects the effort to keep songs upbeat and accessible for younger listeners
Relationship Themes Over Time
- Looked for lyrics containing relationship-related words, such as love, kiss, partner, girlfriend, boyfriend, and similar terms
- Calculated the share of lines each year that included any of those terms
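The share calculation might look like the sketch below. It assumes each record is a (year, lyric line) pair, uses only the five terms named above rather than the full list, and runs on made-up lines, so it illustrates the idea rather than reproducing the analysis.

```python
import re
from collections import defaultdict

# Only the terms named in the text; the full list used in the analysis is not shown.
TERMS = ["love", "kiss", "partner", "girlfriend", "boyfriend"]

# Whole-word, case-insensitive match, so "Gloves" does not count as "love".
PATTERN = re.compile(r"\b(" + "|".join(TERMS) + r")\b", re.IGNORECASE)

# Made-up (year, lyric line) pairs for illustration only.
lines = [
    (2014, "I love the way you dance"),
    (2014, "Turn the music up"),
    (2014, "Hands in the air"),
    (2014, "Sing it loud"),
    (2015, "My boyfriend called me"),
    (2015, "One kiss is all it takes"),
    (2015, "We run all night"),
    (2015, "Gloves on the floor"),
]


def share_by_year(rows):
    """Return {year: fraction of lines containing any relationship term}."""
    total = defaultdict(int)
    hits = defaultdict(int)
    for year, text in rows:
        total[year] += 1
        if PATTERN.search(text):
            hits[year] += 1
    return {year: hits[year] / total[year] for year in sorted(total)}


shares = share_by_year(lines)
print(shares)
```

Whole-word matching is a design choice: it avoids false hits inside longer words but also skips variants such as "loved", which would need to be added to the term list explicitly.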
Relationship Themes Plot
![]()
- The peak around 2015 matches a period when pop music leaned heavily into romantic themes
- Kidz Bop’s edits shift alongside those trends
SQL Traffic Stop Analysis
What I did
- Used traffic stop data from the Stanford Open Policing Project
- Focused on Long Beach (CA), Mesa (AZ), and San Jose (CA)
- Used SQL to:
  - Compare pedestrian vs vehicular stops over time
  - Compare citation rates across race and city
Traffic and Pedestrian Stops Over Time
- Unioned the three cities’ data in order to group and filter it
- Counted stops grouped by city, year, and stop type (pedestrian or vehicular)
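One way to picture the union-then-count step is the sketch below, which runs SQL through Python's built-in `sqlite3` module against an in-memory database. The table names (`long_beach`, `mesa`, `san_jose`), the column names (`date`, `type`), and every row are assumptions made for illustration; the actual query and database used in the analysis are not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical per-city tables with a stop date and a stop type.
for table in ("long_beach", "mesa", "san_jose"):
    conn.execute(f"CREATE TABLE {table} (date TEXT, type TEXT)")

# Made-up rows for illustration only.
conn.executemany("INSERT INTO long_beach VALUES (?, ?)", [
    ("2015-01-03", "vehicular"),
    ("2015-06-10", "vehicular"),
    ("2015-07-01", "pedestrian"),
])
conn.executemany("INSERT INTO mesa VALUES (?, ?)", [
    ("2016-02-11", "vehicular"),
])
conn.executemany("INSERT INTO san_jose VALUES (?, ?)", [
    ("2015-03-05", "pedestrian"),
    ("2016-09-09", "vehicular"),
    ("2016-10-01", None),  # missing type, dropped by the WHERE filter
])

# Stack the three tables with a city label, then count by city, year, and type.
query = """
SELECT city,
       CAST(strftime('%Y', date) AS INTEGER) AS year,
       type,
       COUNT(*) AS stops
FROM (
    SELECT 'Long Beach' AS city, date, type FROM long_beach
    UNION ALL
    SELECT 'Mesa' AS city, date, type FROM mesa
    UNION ALL
    SELECT 'San Jose' AS city, date, type FROM san_jose
) AS all_stops
WHERE type IN ('pedestrian', 'vehicular')
GROUP BY city, year, type
ORDER BY city, year, type
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

`UNION ALL` is used rather than `UNION` so that identical-looking stops are not collapsed into one row before counting.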
What We Can Take From This
- Vehicular stops dominate the totals in every city, though the overall level differs by city
- Trends over time differ from city to city
- Each department operates under its own patterns and volumes of activity
Citation Rates by Race Across Cities
- Used the same three city tables: Long Beach, Mesa, and San Jose
- Restricted to years 2014-2016
- Wrangled data to compute citation rate as citations divided by total stops
- Question: do the citation rates mainly reflect differences across race, or differences in how each city records stop outcomes?
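The rate itself is a simple ratio, sketched here in plain Python. The record layout (city, year, race, citation flag), the placeholder group labels, and all rows are made up for illustration; none of the values come from the actual datasets.

```python
from collections import defaultdict

# Made-up stop records for illustration only:
# (city, year, race, citation_issued)
stops = [
    ("San Jose", 2014, "group_a", True),
    ("San Jose", 2015, "group_a", False),
    ("San Jose", 2015, "group_a", False),
    ("San Jose", 2016, "group_a", False),
    ("San Jose", 2016, "group_b", True),
    ("San Jose", 2016, "group_b", False),
    ("Mesa", 2015, "group_a", True),
    ("Mesa", 2013, "group_a", False),  # outside 2014-2016, excluded
]


def citation_rates(rows, first_year=2014, last_year=2016):
    """Return {(city, race): citations / total stops} within the year range."""
    totals = defaultdict(int)
    cited = defaultdict(int)
    for city, year, race, citation_issued in rows:
        if not first_year <= year <= last_year:
            continue
        key = (city, race)
        totals[key] += 1
        cited[key] += int(citation_issued)
    return {key: cited[key] / totals[key] for key in totals}


rates = citation_rates(stops)
for (city, race), rate in sorted(rates.items()):
    print(f"{city} / {race}: {rate:.2f}")
```

The ratio depends entirely on what counts as a stop in the denominator: if a table only contains stops that ended in a citation, every group's rate comes out as 1.0 regardless of what happened on the street.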
Citation Rates Plot
![]()
- Compares citation rates by race for Long Beach, Mesa, and San Jose between 2014 and 2016
What the Patterns Suggest
- The Long Beach and Mesa datasets may include only stops that led to a citation, or non-citation outcomes may have been recorded inconsistently or not at all
- San Jose has much lower citation rates, at around a quarter of stops
- Citation rates are driven not only by variation across cities, but also by differences in how the underlying agencies record their stops
Final Takeaways
- Small, focused analyses can reveal clear differences across cities and policing practices
- Public datasets like these make it possible to spot patterns that would otherwise stay hidden
- Simple analysis and data visualization can provide meaningful insights
References
- Premier League
- Olympics
- Kidz Bop
- Traffic Stops
- Research Paper
- Pierson, Emma, Camelia Simoiu, Jan Overgoor, Sam Corbett-Davies, Daniel Jenson, Amy Shoemaker, Vignesh Ramachandran, et al. 2020. “A Large-Scale Analysis of Racial Disparities in Police Stops Across the United States.” Nature Human Behaviour.